NSF PAR Search | NSF Public Access Repository


Learning from the unknown: exploring the range of bacterial functionality

https://doi.org/10.1093/nar/gkad757

Mahlich, Yannick; Zhu, Chengsheng; Chung, Henri; Velaga, Pavan K.; De Paolis Kaluza, M. Clara; Radivojac, Predrag; Friedberg, Iddo; Bromberg, Yana (September 2023, Nucleic Acids Research)

Abstract Determining the repertoire of a microbe's molecular functions is a central question in microbial biology. Modern techniques achieve this goal by comparing microbial genetic material against reference databases of functionally annotated genes/proteins or known taxonomic markers such as 16S rRNA. Here, we describe a novel approach to exploring bacterial functional repertoires without reference databases. Our Fusion scheme establishes functional relationships between bacteria and assigns organisms to Fusion-taxa that differ from otherwise defined taxonomic clades. Three key findings of our work stand out. First, bacterial functional comparisons outperform marker genes in assigning taxonomic clades. Fusion profiles are also better for this task than other functional annotation schemes. Second, Fusion-taxa are robust to addition of novel organisms and are, arguably, able to capture the environment-driven bacterial diversity. Finally, our alignment-free nucleic acid-based Siamese Neural Network model, created using Fusion functions, enables finding shared functionality of very distant, possibly structurally different, microbial homologs. Our work can thus help annotate functional repertoires of bacterial organisms and further guide our understanding of microbial communities.
Computational Approaches for Unraveling the Effects of Variation in the Human Genome and Microbiome

https://doi.org/10.1146/annurev-biodatasci-030320-041014

Zhu, Chengsheng; Miller, Maximilian; Zeng, Zishuo; Wang, Yanran; Mahlich, Yannick; Aptekmann, Ariel; Bromberg, Yana (July 2020, Annual Review of Biomedical Data Science)

The past two decades of analytical efforts have highlighted how much more remains to be learned about the human genome and, particularly, its complex involvement in promoting disease development and progression. While numerous computational tools exist for the assessment of the functional and pathogenic effects of genome variants, their precision is far from satisfactory, particularly for clinical use. Accumulating evidence also suggests that the human microbiome's interaction with the human genome plays a critical role in determining health and disease states. While numerous microbial taxonomic groups and molecular functions of the human microbiome have been associated with disease, the reproducibility of these findings is lacking. The human microbiome–genome interaction in healthy individuals is even less well understood. This review summarizes the available computational methods built to analyze the effect of variation in the human genome and microbiome. We address the applicability and precision of these methods across their possible uses. We also briefly discuss the exciting, necessary, and now possible integration of the two types of data to improve the understanding of pathogenicity mechanisms.
Full Text Available
Snow microbiome functional analyses reveal novel aspects of microbial metabolism of complex organic compounds

https://doi.org/10.1002/mbo3.1100

Zhu, Chengsheng; Miller, Maximilian; Lusskin, Nicholas; Bergk_Pinto, Benoît; Maccario, Lorrie; Häggblom, Max; Vogel, Timothy; Larose, Catherine; Bromberg, Yana (September 2020, MicrobiologyOpen)

Abstract Microbes active in extreme cold are not as well explored as those of other extreme environments. Studies have revealed a substantial microbial diversity and identified cold‐specific microbiome molecular functions. We analyzed the metagenomes and metatranscriptomes of 20 snow samples collected in early and late spring in Svalbard, Norway using mi‐faser, our read‐based computational microbiome function annotation tool. Our results reveal a more diverse microbiome functional capacity and activity in the early‐ vs. late‐spring samples. We also find that functional dissimilarity between the same‐sample metagenomes and metatranscriptomes is significantly higher in early than late spring samples. These findings suggest that early spring samples may contain a larger fraction of DNA of dormant (or dead) organisms, while late spring samples reflect a new, metabolically active community. We further show that the abundance of sequencing reads mapping to the fatty acid synthesis‐related microbial pathways in late spring metagenomes and metatranscriptomes is significantly correlated with the organic acid levels measured in these samples. Similarly, the organic acid levels correlate with the pathway read abundances of geraniol degradation and inversely correlate with those of styrene degradation, suggesting a possible nutrient change. Our study thus highlights the activity of microbial degradation pathways of complex organic compounds previously unreported at low temperatures.
Full Text Available
Fingerprinting cities: differentiating subway microbiome functionality

https://doi.org/10.1186/s13062-019-0252-y

Zhu, Chengsheng; Miller, Maximilian; Lusskin, Nick; Mahlich, Yannick; Wang, Yanran; Zeng, Zishuo; Bromberg, Yana (December 2019, Biology Direct)

Abstract Background: Accumulating evidence suggests that the human microbiome impacts individual and public health. City subway systems are human-dense environments, where passengers often exchange microbes. The MetaSUB project participants collected samples from subway surfaces in different cities and performed metagenomic sequencing. Previous studies focused on the taxonomic composition of these microbiomes, and no explicit functional analysis had been done until now. Results: As a part of the 2018 CAMDA challenge, we functionally profiled the available ~400 subway metagenomes and built a predictor for city origin. In cross-validation, our model reached 81% accuracy when only the top-ranked city assignment was considered and 95% accuracy if the second city was taken into account as well. Notably, this performance was only achievable if the distribution of cities in the training and testing sets was similar. To ensure that our methods are applicable without such biased assumptions, we balanced our training data to account for all represented cities equally well. After balancing, the performance of our method was slightly lower (76/94%, respectively, for one or two top-ranked cities), but still consistently high. Here we attained an added benefit of independence of training set city representation. In testing, our unbalanced model thus reached (an over-estimated) performance of 90/97%, while our balanced model was at a more reliable 63/90% accuracy. While, by definition of our model, we were not able to predict previously unseen microbiome origins, our balanced model correctly judged them to be NOT-from-training-cities over 80% of the time. Our function-based outlook on microbiomes also allowed us to note similarities between both regionally close and far-away cities. Curiously, we identified the depletion in mycobacterial functions as a signature of cities in New Zealand, while photosynthesis-related functions fingerprinted New York, Porto and Tokyo.
Conclusions: We demonstrated the power of our high-speed function annotation method, mi-faser, by analysing ~400 shotgun metagenomes in 2 days, with the results recapitulating functional signals of different city subway microbiomes. We also showed the importance of balanced data in avoiding over-estimated performance. Our results revealed similarities between both geographically close (Ofa and Ilorin) and distant (Boston and Porto, Lisbon and New York) city subway microbiomes. The photosynthesis-related functional signatures of NYC were previously unseen in taxonomy studies, highlighting the strength of functional analysis.
Full Text Available
fusionDB: assessing microbial diversity and environmental preferences via functional similarity networks

https://doi.org/10.1093/nar/gkx1060

Zhu, Chengsheng; Mahlich, Yannick; Miller, Maximilian; Bromberg, Yana (November 2017, Nucleic Acids Research)

Full Text Available
clubber: removing the bioinformatics bottleneck in big data analyses

https://doi.org/10.1515/jib-2017-0020

Miller, Maximilian; Zhu, Chengsheng; Bromberg, Yana (June 2017, Journal of Integrative Bioinformatics)

Abstract With the advent of modern day high-throughput technologies, the bottleneck in biological discovery has shifted from the cost of doing experiments to that of analyzing results. clubber is our automated cluster-load balancing system developed for optimizing these “big data” analyses. Its plug-and-play framework encourages re-use of existing solutions for bioinformatics problems. clubber’s goals are to reduce computation times and to facilitate use of cluster computing. The first goal is achieved by automating the balance of parallel submissions across available high performance computing (HPC) resources. Notably, the latter can be added on demand, including cloud-based resources, and/or featuring heterogeneous environments. The second goal of making HPCs user-friendly is facilitated by an interactive web interface and a RESTful API, allowing for job monitoring and result retrieval. We used clubber to speed up our pipeline for annotating molecular functionality of metagenomes. Here, we analyzed the Deepwater Horizon oil-spill study data to quantitatively show that the beach sands have not yet entirely recovered. Further, our analysis of the CAMI-challenge data revealed that microbiome taxonomic shifts do not necessarily correlate with functional shifts. These examples (21 metagenomes processed in 172 min) clearly illustrate the importance of clubber in the everyday computational biology environment.
Full Text Available
Functional sequencing read annotation for high precision microbiome analysis

https://doi.org/10.1093/nar/gkx1209

Zhu, Chengsheng; Miller, Maximilian; Marpaka, Srinayani; Vaysberg, Pavel; Rühlemann, Malte C; Wu, Guojun; Heinsen, Femke-Anouska; Tempel, Marie; Zhao, Liping; Lieb, Wolfgang; et al (November 2017, Nucleic Acids Research)

Full Text Available
